A Geometric Perspective on Speech Sounds

Author

A. Jansen

Abstract

To approach high-dimensional pattern recognition problems effectively, one seeks to understand and exploit any inherent low-dimensional structure. Recently, a number of manifold learning algorithms have been motivated by a geometric point of view that models high-dimensional data as lying near a low-dimensional submanifold of the original space. Our paper has two main goals. (i) To investigate this manifold assumption for natural speech data. It seems intuitive that a human speech-producing apparatus with few degrees of freedom would not produce sounds that fill up the acoustic space. We formalize this intuition by considering a concatenated acoustic tube model of the vocal tract and showing that the sounds generated by such a system lie on a low-dimensional curved submanifold of the ambient acoustic space. To the extent that this model captures the essence of human speech production, the manifold assumption is true of natural speech data. (ii) To explore the implications of this geometric point of view for human speech. We show that the manifold structure of speech sounds may be exploited for dimensionality reduction, semi-supervised learning, and speech representation, with sometimes striking performance improvements on simulated and real speech data. The non-linear geometry of speech sounds suggests new interpretations of phenomena such as the perceptual magnet effect or quantal theory.

Related resources

Semi-supervised learning of speech sounds

Recently, there has been much interest in both semi-supervised and manifold learning algorithms, though their applicability has not been explored for all domains. This paper has two goals: (i) to demonstrate that semi-supervised approaches based solely on clustering are insufficient for phoneme classification and (ii) to present a new manifold-based semi-supervised algorithm to remedy this shortcomi...

Synthesis: One Vocal Tract Target Configuration Has More than One Sound

Articulatory speech synthesis can be used for speech production research to gain insight into articulation patterns and their acoustic counterparts, the speech sounds. It can be used e.g. to conduct perception experiments that study the relationship between articulation and fine phonetic detail in the acoustic domain. In a case study, we focus on articulatory details in German vowels. Results i...

Comparing the Effect of Production-Training-Based Treatment with Non-speech Oral Motor Training on the Speech of 4-6-Year-Old Children with Phonological Disorder

Objective: Speech sound disorders are among the most common speech disorders in children. Speech-language pathologists have long used non-speech oral motor exercises as a facilitative activity in therapy sessions for a wide variety of speech disorders, but there are few controlled empirical data with which to evaluate their effectiveness. This study aimed at comparing the effects of therapeu...

Usability of Non-speech Sounds in User Interfaces

We review the literature on the integration of non-speech sounds into visual interfaces and applications from a usability perspective and subsequently recommend which auditory feedback types serve to enhance human interaction with computers by conveying useful and comprehensible information. We present an overview of varied tasks, functions and environments with a view to establishing the best ...

Speech development and auditory performance in children after cochlear implantation

Abstract Background: The aim of this study was to determine the auditory performance of congenitally deaf children and the effect of cochlear implantation (CI) on speech intelligibility. Methods: A prospective study was undertaken on 47 children in a pediatric tertiary referral center for CI. All children were prelingually deaf and were younger than 8 years of age. They were followed up until 5...

Publication date: 2005
